Appendix for "Comprehensive Knowledge Distillation with Causal Intervention" A. Implementation Details

Neural Information Processing Systems

CIFAR-10 is an image classification dataset. It contains 50,000 training images and 10,000 test images in 10 classes. We adopt the standard data augmentation strategy on CIFAR datasets, i.e., padding 4 pixels on each side of an image and randomly flipping it horizontally, and ... Tiny ImageNet is a subset of ImageNet. We adopt the standard data augmentation, i.e., padding 8 pixels on each side ... ImageNet is a large-scale image classification dataset containing 1.28 million training images and ... The standard augmentation [6, 4] is adopted. To save cost, we do a very basic search instead of a grid search.
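The padding-and-flipping augmentation described above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function name `augment`, the use of NumPy, the zero-valued padding, and the flip probability of 0.5 are all assumptions. The random crop back to the original size is also an assumption, taken from the commonly used CIFAR recipe, since the excerpt is cut off before it says what follows the padding.

```python
import numpy as np


def augment(image, pad=4, rng=None):
    """Pad an HxWxC image, randomly crop back to HxW, and randomly flip it.

    Hypothetical helper for illustration only. `pad=4` matches the CIFAR
    setting in the text; `pad=8` matches the Tiny ImageNet setting.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w, _ = image.shape

    # Zero-pad `pad` pixels on each spatial side; channels are left untouched.
    padded = np.pad(image, ((pad, pad), (pad, pad), (0, 0)), mode="constant")

    # Assumption: crop a random HxW window out of the padded image.
    top = int(rng.integers(0, 2 * pad + 1))
    left = int(rng.integers(0, 2 * pad + 1))
    crop = padded[top:top + h, left:left + w]

    # Assumption: flip horizontally with probability 0.5.
    if rng.random() < 0.5:
        crop = crop[:, ::-1]
    return crop


# A CIFAR-sized dummy image (32x32 RGB) keeps its shape after augmentation.
image = np.zeros((32, 32, 3), dtype=np.uint8)
out = augment(image, pad=4, rng=np.random.default_rng(0))
print(out.shape)
```

For a 64x64 input, calling `augment(image, pad=8)` applies the same steps with the 8-pixel padding mentioned for Tiny ImageNet.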



Reply to Reviewer

Neural Information Processing Systems

We thank all reviewers for their valuable feedback and constructive suggestions. Major comments are addressed below. Several works (e.g., [7] and [11]) follow a similar rationale. We thank the reviewer for suggesting these large-scale image datasets. Q1: What is "evidence-based entropy" when claiming that entropy can be decomposed into vacuity and dissonance?